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Observation of clustering of ultra-high energy cosmic rays (UHECR) suggests that they are emitted 
by compact sources. Assuming small (< 3°) deflection of UHECR during the propagation, the 
statistical analysis of clustering allows to estimate the spatial density of the sources h,, including 
those which have not yet been observed directly. When applied to astrophysical models involving 
extra-galactic sources, the estimate based on 14 events with energy E > 10 20 eV gives h* ~ 6 x 
10~ 3 Mps -3 . With increasing statistics, this estimate may lead to exclusion of the models which 
associate the production of UHECR with exceptional galaxies such as AGN, powerful radio-galaxies, 
dead quasars, and models based on gamma ray bursts. 



I. INTRODUCTION 

Recent analysis of arrival directions of ultra-high ener- 
gy cosmic rays (UHECR) reveals groups of events (clus- 
ters) with arrival directions lying within ~ 3°, the typi- 
cal angular resolution of the experiment. The set of 92 
observed events with energy E > 4 x 10 19 eV contains 
7 doublets and 2 triplets M. The small probability of 



chance coincidence, of the order of 10 



-3 



suggests 



that clustering is a result of the existence of compact 
sources. At higher energies, E > 10 20 eV, one doublet 
out of 14 events is observed. 

Compact sources of UHECR are naturally explained 
in astrophysical models where they are associated with 
possible UHECR production sites, such as AGN pi, hot 
spots of powerful radio-galaxies Q , dead quasars |5| and 
gamma-ray bursts (GRB) These models have much 
in common. They assume that primary particles are pro- 
tons; the sources of the observed UHECR have, therefore, 
to lie within the GZK cutoff distance. For energies 
E s» 10 20 eV the GZK radius is i? G ZK ~ 50 Mpc, while 
at E > 2 x 10 20 eV it drops to - 20 Mpc [§. In all 
these models the distribution of sources in space within 
the GZK sphere is close to uniform, while the distribu- 
tion in luminosity does not depend on space and peaks 
around a certain value. 

An important common feature of these models is a 
small local density of sources. The number density of 
dead quasars is estimated as ~ 10 -4 Mpc~ 3 Q; the num- 
ber of AGN is ~ 10% of the number of galaxies [H , which 
gives ~ 5 x 10 -4 Mpc -3 . Most likely, only a small fraction 
of them is capable of producing UHECR with energies 
E > 10 20 eV. In the case of GRB the effective density of 
sources is determined by the rate 7 of GRB and the typi- 

yr 

_1 rfTTli 

density of sources ~ 10 -5 Mpc -3 . 

The purpose of this letter is to show that the observed 
clustering favors larger density of sources, provided the 
propagation of UHECR with energy E > 10 20 eV is not 



cal time delay r of UHECR particles. Taking r <, 10 5 
and the rate 7~2x 10 -10 /i 3 Mpc -3 yr -1 M gives the 



strongly affected by extra-galactic magnetic fields. The 
latter assumption is justified if the existing bound on 
extra-galactic magnetic field B <, 10 -9 G Q is valid. 



II. STATISTICS OF CLUSTERING 

The observable quantities which characterize cluster- 
ing are N m , the expected numbers of clusters of differ- 
ent multiplicities m (e.g., N\ and N2 are the expect- 
ed numbers of single and double events, respectively). 
They depend on the total exposure of the experiment B 
and the distribution of sources in the flux they produce^, 
n(F), which is defined in such a way that the number of 
sources with the flux from F to F + dF is dS — n(F)dF. 
The events which come from the same source at differ- 
ent times are statistically independent and therefore have 
the Poisson distribution. Thus, the expected number of 
clusters is 



N m = 



(^)™ 



(F)dF. 



(1) 



This equation implies that the expected total number of 
events N tot is 



Ntot = mN " 



B / Fn(F)dF = BF U 



(2) 



as it should be. The probability to observe k clusters of 
multiplicity m is also given by the Poisson distribution, 



Pm{k) 



-N„ 



k\ 



(3) 



*Here and below we mean the integral flux of cosmic rays 
with energies above some energy threshold. It measures the 
average number of events per unit time per unit area of the 
detector. 
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Any model of UHECR can be characterized by the dis- 
tribution of sources in distance and luminosity f(r,L) 
(in the case of anisotropic distribution it should be 
understood as average over the sphere, f(r,L) = 
J /(r, L)dVt/An)). In order to express n(F) and N m in 
terms of the distribution function f(r,L), consider the 
sources at distances from r to r + dr. The number of 
such sources with luminosities from L to L + dL is 



dS = f{r,L)4irr 2 drdL. 



(4) 



Making use of the relation F = L/Anr 2 and integrating 
over r one finds dS — n(F)dF, where 



n(F) = (4tt) 2 / drr 4 f(r,4irr 2 F). 



(5) 



Here we have neglected the curvature effects since they 
are small at distances of order 50 Mpc. 

In the case of the astrophysics! models, the distribution 
function f(r, L) is uniform in space and depends only on 
the luminosity. The GZK effect, however, makes distant 
sources fainter by a factor exp(— r/R). This is equivalent 
to substituting 



f(r,L) 



h(Le r/R ) 



(6) 



in Eq.([s]), where h(L) is the distribution of sources in 
the luminosity. The exact value of R can be determined 
by numerical simulations of UHECR propagation with 
full account of the energy dependence. For protons with 
E > 10 20 eV the simulations give R ~ 25 Mpc O. 



III. NUMBER OF SOURCES 

A key parameter which enters the distribution f(r,L) 
is the normalization, or the spatial density of sources, 
which can be characterized by the number of sources <S* 
within the sphere of a radius R. An important informa- 
tion about S can be obtained from statistical analysis of 
clustering even if the functional form of the distribution 
f(r,L) is not known. The idea is to find the distribu- 
tion f(r, L) which corresponds to the minimum number 
of sources S 1 * with total number of events, A to t, and the 
number of events in clusters, A c i = N tot — N±, being fixed. 
We will show in a moment that in the case N c \ <C At ot 
the number S* is surprisingly large, much larger than the 
number of the sources already observed. 

It is intuitively clear why in the case A c i <C Atot the 
number of sources is much larger than A to t [[13| . In order 
to produce ~ A to t single events by ~ A to t sources each of 
them has to be bright enough. But then a large number 
of doublets would be produced as well. Since this is not 
the case, i.e. most of the resolved sources are dim and 
produce at most one event, one concludes that there is 
a large number of sources which have not yet revealed 
themselves. Assuming that all sources have the same 
flux F one finds from Eq. (@) Ni ~ Sn and A 2 ~ Sn 2 /2, 



where n = FB is the average number of events produced 
by one source. Therefore, S ~ N%/(2N 2 ) ~ A t 2 ot /A cl , 
i.e. much larger than A to t- Using methods described 
in the Appendix it is possible to show that the case of 
equal fluxes corresponds to the absolute minimum of S. 
However, this distribution is unphysical. Many realistic 
situations correspond to a homogeneous distribution of 
sources in space when more distant sources are fainter; 
consequently, their number has to be even larger than 
predicted by the above estimate. 

In astrophysical models the distribution /(r, L) is given 
by Eq. (o) containing one unknown function h{L). The 
minimum density of sources is determined by minimizing 
over h(L). As is shown in the Appendix, the minimum 
is reached at the delta-function distribution 



h(L) = h*6(L-L*), 



(7) 



where L» is the luminosity of the sources and /i* is their 
spatial density. The unknown parameters h* and L* can 
be related to A to t and N c \ by making use of Eqs.(|) and 
(0). Introducing the notations 



5* = {Att/3)R 3 K, 
= BL»/(47ri? 2 ), 



(8) 



where is the expected number of events from one 
source at the distance R in the absence of the GZK cutoff, 
one has the following equations, 



Atot — SS^is* , 



(9) 



POO 

Ni = 35*2/* / dxexp (-x - v*x~ 2 e~ x ) . (10) 

These equations can be solved perturbatively at small 
Nd < AW One finds 



1 A 2 



*N 2 ot : 



3 A 2 ' 



(11) 



(12) 



If N c i <C Atot , the minimum number of sources S* is in- 
deed much larger than A tot and, therefore, is much larger 
than the number of sources already observed. From Eq. 
(pd|), each source produces much less than 1 event in av- 
erage. 



IV. DISCUSSION 

Let us apply these arguments to the observed events 
with energies E > 10 20 eV. In this case, Atot = 14 and 
A c i = 2. The solution to Eqs.@ and (0) is S* - 400, 
which at R = 25 Mpc corresponds to the density 



6 x 10 



-3 



Mpc" 



(13) 
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This number is large as compared to the density of 
sources in most of astrophysical models. However, it 
should be interpreted with care. One may expect large 
statistical fluctuations because both N to t and N c \ are 
small and may not coincide with their expected values. 

In order to address this issue quantitatively let us 
find the model which has the largest probability p(h*) 
to reproduce the observed clustering at fixed density of 
sources h*. To this end, consider the set of models which 
are described by Eqs.(^) and (0) and are characterized by 
two parameters /i* and L*. At fixed density of sources, 
there remains a freedom of changing L* . The probability 
to reproduce the observed data is maximum for some L„\ 
this probability is p(h^). By construction, there are no 
models with the density of sources smaller than /i*, in 
which the probability to reproduce the observed data is 
larger than p(h*). 



proba- 


14 events 


30 events 


60 events 


bility 


1 doublet 


1 doublet 


1 doublet 


P 


/i* 


V 


/i* 


V 


/i* 


V 


0.1 


23 


0.51 


320 


0.065 


3100 


0.012 


0.01 


3.2 


5.7 


63 


0.38 


690 


0.058 


0.001 






21 


1.3 


260 


0.15 



TABLE I. Minimum density of sources h„ in the units of 10~ 5 
Mpc -3 , and corresponding source luminosity in the units of 
v — BL,/(4tvR 2 ), which are required to reproduce the ob- 
served clustering with given probability p for the real exper- 
imental data (1 doublet out of 14 events) and for two hypo- 
thetical data sets with larger number of events (one doublet 
out of 30 events and one doublet out of 60 events). 

There is some ambiguity in defining what is "to repro- 
duce the observed data" . In the case at hand we request 
that the number of singlets is 12 or larger, the number 
of doublets is 1 or smaller, and the number of clusters 
with the multiplicity 3 and larger is zero. Eq. ([}]) de- 
termines the probability p(h*, L») of such clustering as a 
function of two parameters /i* and L*. The probability 
p(/i*) is found by maximizing p(h*,L„) at fixed h„. We 
have performed this calculation numerically. The results 
are summarized in Table 1 in the form of lower bounds on 
the density of sources. We also present the source lumi- 
nosity in the units of v = BL*/(4irR 2 ), i.e. the number 
of events from a single source at the distance R. 

The models where the observed clustering occurs with 
probability 1% have minimum ~ 2 sources inside the 25 
Mpc sphere. In the latter case most of the observed 14 
events are produced by the sources which are further than 
25 Mpc and thus have to be bright enough. This is re- 
flected in Table 1 from which we see that these sources 
would produce in average ~ 6 events each (in the absence 
of the GZK cutoff) if placed at 25 Mpc. 

It is worth noting that the numbers of Table 1 corre- 
spond to the extreme situation when the distribution of 
sources is given by Eq. (Q) with a particular value of L„ . 



In realistic models, the distribution of sources in lumi- 
nosity is usually spread over an order of magnitude at 
least. There may also be constraints on the luminosity 
of the sources. In these cases, the lower bounds on the 
number of sources are higher than in Table 1. 

When the new large-area detectors like the Pierre 
Auger array []l4|| will start operating, the number of ob- 
served events will increase and the statistical errors in 
determination of the density of sources will go down. 
Correspondingly, the lower bounds on the density will 
become higher. To show that the bounds may become 
very high when the total number of events is still small, 
we have performed calculations for two hypothetical sit- 
uations, 1 doublet out of 30 events, and 1 doublet out 
of 60 events. The results are also listed in Table 1. The 
bounds grow roughly like cube of the number of events, 
in agreement with Eq.(|l2|). 

To summarize, the statistical analysis of clustering 
may provide tight constraints on astrophysical models 
of UHECR when the number of clusters is small. In this 
situation, a key quantity is the density of sources which 
can be bound from below in a model-independent way. 
The bound grows very fast with the number of single 
events above E — 10 20 eV and is potentially dangerous 
for astrophysical models which associate production of 
UHECR with GRB or exceptional galaxies such as AGN, 
powerful radio-galaxies and dead quasars. 

Our method equally applies to models in which UHE- 
CR are produced in the Galactic halo, or in which pri- 
mary particles are immune to the background radiation. 
The relation (|l^) remains valid with a different numeri- 
cal coefficient of order one and different meaning of S*. 
In the first case S 1 * is the number of sources in the ha- 
lo and detailed analysis shows that statistical properties 
of clustering of UHECR are compatible with dumpiness 
of super-heavy dark matter in decays of which UHECR 
may be produced. In the second case our method counts 
the number of UHECR sources within the cosmological 
horizon, which is inaccessible by other means. 
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APPENDIX: MINIMUM NUMBER OF SOURCES 

Consider the problem in general terms. First note that 
by changing the integration variable in Eq. (||) one can 



3 



show that any distribution is equivalent to a factorizable 
one. So, let us take the distribution of sources in the 
form 

f(r,L)=g(r)h(L). (14) 
Let us fix g(r) and minimize the number of sources 

S = 4n r 2 g(r)dr / h(L)dL 
Jo Jo 

with respect to the distribution h{L) under the con- 
straints fixing JVtot and N\, 



(15) 
(16) 



B / g(r)dr / Lh{L)dL = N tot , 
Jo Jo 

poo / J^J^ \ 

BJ drdL Lh(L)g(r)exp^--^j = Ni. 
This is equivalent to minimizing the functional 

poo poo if r>r 

W = 4tt J dLh(L) J g{r)drl r 2 + A— 



BL 




AiVtot + fiNi 



(17) 



with respect to h(L). Here A and pL are the Lagrange 
multipliers. 

The functional ([It]) is linear in h(L); denote the coef- 
ficient by G(L), 



G(L) = jf g{r)dr^r 2 + A^-^exp 



BL 

Anr 2 



At those values of L where G(L) is negative, the min- 
imum of W is at h(L) — » oo. The latter, however, is 
not compatible with Eqs.([l5|) and (|l^). Therefore, at 
the minimum the values of A and pi have to be such that 
G(L) is non-negative. 

At those values of L where G(L) is positive, the min- 
imum of W is reached at h(L) = 0. If G(L) is positive 
at all L, then h(L) is identically zero and Eqs.(15) and 
( |l6| ) are again violated. Therefore, A and pL must be such 
that G(L) touches zero at some L*. The function h(L) 
is non-zero only at this point. Thus, the minimum num- 
ber of sources corresponds to the situation when all of 
them have the same luminosity L*, and we arrive at the 
delta-function distribution, Eq. ((?]). 

It remains to show that for a given positive function 
g(r) satisfying J g(r)dr < oo the Lagrange multipliers A 
and pi can always be chosen in such a way that G(L) is 
positive everywhere except an isolated point. To this end 
rewrite G(L) in the following form, 



where C = J r 2 g{r)dr is a positive constant and the func- 
tion F(L) depends only on the ratio fi/X, 



F(L) 



BL 

47T 



g{r)dr \ 1 - - exp ( - 



BL 



irrr 2 ) 



The behavior of the function F(L) is the following. At 
L — > it goes to zero. At small L it is negative if 
fi/X > 1 and positive otherwise. At L — > oo it grows 
linearly with L, the coefficient being B/4tt j g(r)dr > 0. 
Therefore, at /i/A > 1 the function F(L) must have an 
absolute minimum at some L* > (which is a function 
of /i/A). Then it is clear from Eq.dlq) that by choosing 
A = — C/F(L*) > one can set G{L) to zero in that par- 
ticular point. The argument can be easily generalized to 
the case of infinite number of sources, J g{r)r 2 dr = oo. 

In order to apply this argument to the case of astro- 
physical models, one should find the factorizable distri- 
bution f(r,L) which produces the same n(F) as Eq.(||). 
This can be done by substituting Eq.(||) into Eq.(||) and 
changing the integration variable according to 



1 exp(r/i?) 



(19) 



The result reads 



f(x,L)=g(x)h(L), 

where 

g{x) = (l+r{x)/2R)- 1 c- 3r ^ x ^ 2R 
and r(x) is defined by Eq.(p9h. 
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